Feat/voyageai: adding voyageai integration #4070
fzowl wants to merge 9 commits into simstudioai:main
…nnection string support
- Add VoyageAI tools: embeddings (voyage-3, voyage-3-large, etc.) and rerank (rerank-2, rerank-2-lite)
- Add VoyageAI block with operation dropdown (Generate Embeddings / Rerank)
- Add VoyageAI icon and register in tool/block registries
- Enhance MongoDB with connection string mode for Atlas (mongodb+srv://) support
- Add connection mode toggle to MongoDB block (Host & Port / Connection String)
- Update all 6 MongoDB API routes to accept optional connectionString
- Add 48 unit tests (VoyageAI tools, block config, MongoDB utils)
…geAI and MongoDB
- Expand VoyageAI tool tests: metadata, all models, edge cases, error codes (60 tests)
- Expand VoyageAI block tests: structure, subBlocks, conditions, params edge cases (44 tests)
- Expand MongoDB utils tests: connection modes, URI building, all validators (56 tests)
- Add live integration tests: embeddings (7 models/scenarios), rerank (5 scenarios), e2e workflow
- Integration tests use undici to bypass global fetch mock
- Tests skip gracefully when VOYAGEAI_API_KEY env var is not set
- Add voyage-4-large, voyage-4, voyage-4-lite embedding models
- Add voyage-3.5, voyage-3.5-lite embedding models
- Add rerank-2.5, rerank-2.5-lite reranking models
- Default embeddings model: voyage-3.5
- Default rerank model: rerank-2.5
- All models verified working with live API
…tegration
- New tool: voyageai_multimodal_embeddings using voyage-multimodal-3.5 model
- New API route: /api/tools/voyageai/multimodal-embeddings for server-side file handling
- Supports text, image files/URLs, video files/URLs in a single embedding
- Uses file-upload subBlocks with basic/advanced mode for images and video
- Internal proxy pattern: downloads UserFiles via downloadFileFromStorage, converts to base64
- URL validation via validateUrlWithDNS for SSRF protection
- 14 new unit tests (tool metadata, body, response transform)
- 5 new integration tests (text-only, image URL, text+image, dimensions, auth)
- 8 new block tests (multimodal operation, params, subBlocks)
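To make the multimodal commit above concrete, here is a minimal sketch of how text, image URLs, and already base64-encoded image files might be combined into one request body. The type names, `buildMultimodalContent`, and `buildRequestBody` are hypothetical, and the content item shapes (`text`, `image_url`, `image_base64` as a data URL) are an assumption about the VoyageAI multimodal request format, not code taken from this PR. Video inputs and the storage download step are left out.

```typescript
// Hypothetical types for illustration; not the PR's actual definitions.
type ContentItem =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: string }
  | { type: "image_base64"; image_base64: string };

interface EncodedImage {
  mimeType: string; // e.g. "image/png"
  base64: string; // file bytes, already base64-encoded by the caller
}

interface MultimodalInput {
  text?: string;
  imageUrls?: string[];
  images?: EncodedImage[];
}

// Builds the content list for a single multimodal input.
// Assumed item shapes: text, image_url, and image_base64 (as a data URL).
function buildMultimodalContent(input: MultimodalInput): ContentItem[] {
  const content: ContentItem[] = [];

  const text = input.text ? input.text.trim() : "";
  if (text !== "") {
    content.push({ type: "text", text });
  }
  for (const url of input.imageUrls || []) {
    content.push({ type: "image_url", image_url: url });
  }
  for (const image of input.images || []) {
    content.push({
      type: "image_base64",
      image_base64: `data:${image.mimeType};base64,${image.base64}`,
    });
  }

  if (content.length === 0) {
    throw new Error("At least one text, image URL, or image file is required");
  }
  return content;
}

// Wraps the content list in a request body; the model name comes from the commit message.
function buildRequestBody(input: MultimodalInput, model = "voyage-multimodal-3.5") {
  return { model, inputs: [{ content: buildMultimodalContent(input) }] };
}
```

In the PR, URL inputs are additionally checked with validateUrlWithDNS before being forwarded; that step is deliberately not reproduced here.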
Remove non-TSDoc separator comments, fix relative import in barrel export, fix any types, and apply biome formatting fixes.
Reverts MongoDB Atlas connection string support due to validation issues in the Zod schemas. VoyageAI integration remains intact.
PR Summary
Medium Risk
Overview: Registers three new tools ( … Adds a new …
Reviewed by Cursor Bugbot for commit 6394764.
@fzowl is attempting to deploy a commit to the Sim Team on Vercel. A member of the Team first needs to authorize it.
✅ There are no secrets present in this pull request anymore. If these secrets were true positives and are still valid, we highly recommend that you revoke them.
Greptile Summary
This PR adds a VoyageAI integration covering text embeddings, multimodal embeddings (images + video via an internal proxy route), and document reranking, with matching block, registry entries, and a comprehensive test suite. Two issues need attention before merge.
Confidence Score: 3/5
Two P1 issues, a canonicalParamId constraint violation and missing response.ok guards, should be fixed before merging. The canonicalParamId === id violation breaks a documented critical rule and may cause the canonical param transformation layer to drop file input values at runtime. The missing response.ok check means real API errors (invalid key, rate limit) produce opaque TypeErrors rather than actionable messages. Both are on the changed code paths and need resolution.
Affected files: apps/sim/blocks/blocks/voyageai.ts (canonicalParamId constraint), apps/sim/tools/voyageai/embeddings.ts and rerank.ts (response error handling)
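The response.ok concern can be illustrated with a minimal sketch of a guarded transformResponse. This is not the PR's code: `ResponseLike`, `VoyageBody`, and `toEmbeddingsOutput` are hypothetical names, and both the success shape (`data[].embedding`, `model`, `usage.total_tokens`) and the error shape (`detail`) are assumptions about the VoyageAI API.

```typescript
// Minimal structural type so the sketch does not depend on global fetch typings.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Assumed response body: success fields and the error `detail` field are both optional here.
interface VoyageBody {
  data?: { embedding: number[] }[];
  model?: string;
  usage?: { total_tokens?: number };
  detail?: unknown;
}

interface EmbeddingsOutput {
  embeddings: number[][];
  model: string;
  totalTokens: number;
}

// Checks the HTTP status before touching success-only fields,
// so API failures surface as a readable message instead of a TypeError.
function toEmbeddingsOutput(ok: boolean, status: number, body: VoyageBody | null): EmbeddingsOutput {
  if (!ok) {
    const detail = body && typeof body.detail === "string" ? body.detail : "unknown error";
    throw new Error(`VoyageAI API error (${status}): ${detail}`);
  }
  if (!body || !Array.isArray(body.data)) {
    throw new Error("VoyageAI API returned an unexpected response shape");
  }
  return {
    embeddings: body.data.map((item) => item.embedding),
    model: body.model || "",
    totalTokens: body.usage && body.usage.total_tokens ? body.usage.total_tokens : 0,
  };
}

async function transformResponse(response: ResponseLike): Promise<EmbeddingsOutput> {
  // An error body may not be JSON at all, so fall back to null instead of throwing here.
  const body = (await response.json().catch(() => null)) as VoyageBody | null;
  return toEmbeddingsOutput(response.ok, response.status, body);
}
```

Splitting the pure mapping step from the async wrapper keeps the status check unit-testable without mocking fetch.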
| Filename | Overview |
|---|---|
| apps/sim/blocks/blocks/voyageai.ts | New VoyageAI block with embeddings, multimodal embeddings, and rerank operations; canonicalParamId equals the subblock id for imageFiles and videoFile, violating the documented constraint. |
| apps/sim/tools/voyageai/embeddings.ts | Text embeddings tool; transformResponse does not check response.ok, so API errors produce a cryptic TypeError instead of the actual error message. |
| apps/sim/tools/voyageai/rerank.ts | Rerank tool; same missing response.ok guard as embeddings.ts, plus truncation param declared in types but not used here. |
| apps/sim/tools/voyageai/types.ts | Type definitions for all three operations; truncation is declared on both VoyageAIEmbeddingsParams and VoyageAIRerankParams but never wired into any tool or block. |
| apps/sim/app/api/tools/voyageai/multimodal-embeddings/route.ts | Internal proxy route for multimodal embeddings; properly validates input with Zod, uses checkInternalAuth, validates URLs with DNS, and handles all file/URL content types correctly. |
| apps/sim/tools/voyageai/multimodal-embeddings.ts | Multimodal embeddings tool; routes through the internal proxy (correct pattern for file handling), response transformation delegates to the route's structured output. |
| apps/sim/tools/voyageai/voyageai.test.ts | Comprehensive unit tests for all three tools; uses vi.resetAllMocks() in afterEach which conflicts with the project's testing guidelines. |
| apps/sim/tools/voyageai/voyageai.integration.test.ts | Integration tests skipped without VOYAGEAI_API_KEY; uses undici to bypass global fetch mock correctly, covers embeddings, rerank, and multimodal scenarios. |
| apps/sim/blocks/blocks/voyageai.test.ts | Block-level unit tests with solid coverage of subBlock structure, tool routing, and params mapping; all assertions look correct. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[VoyageAI Block] --> B{Operation}
B -->|embeddings| C[embeddingsTool]
B -->|rerank| D[rerankTool]
B -->|multimodal| E[multimodalEmbeddingsTool]
C -->|POST direct| F[VoyageAI Embeddings API]
D -->|POST direct| G[VoyageAI Rerank API]
E -->|POST proxy| H[Internal Multimodal Route]
H --> I{Content type}
I -->|Text| J[content: text]
I -->|imageFiles| K[base64 encode via storage]
I -->|imageUrls| L[validateUrlWithDNS]
I -->|videoFile| M[base64 encode via storage]
I -->|videoUrl| N[validateUrlWithDNS]
J & K & L & M & N --> O[VoyageAI Multimodal API]
F & G & O --> P[embeddings / results / usage]
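The routing in the flowchart can be sketched as a small lookup from operation to tool. The operation names follow the edge labels in the flowchart; `voyageai_multimodal_embeddings` is the tool id named in the commit message, while `voyageai_embeddings`, `voyageai_rerank`, and the `selectTool` helper are assumptions made for illustration and may not match the block's actual config.

```typescript
// Operation names taken from the flowchart edge labels.
const OPERATIONS = ["embeddings", "rerank", "multimodal"] as const;
type Operation = (typeof OPERATIONS)[number];

interface ToolRoute {
  toolId: string;
  // "direct" calls the VoyageAI API; "proxy" goes through the internal route for file handling.
  transport: "direct" | "proxy";
}

const ROUTES: Record<Operation, ToolRoute> = {
  embeddings: { toolId: "voyageai_embeddings", transport: "direct" }, // assumed id
  rerank: { toolId: "voyageai_rerank", transport: "direct" }, // assumed id
  multimodal: { toolId: "voyageai_multimodal_embeddings", transport: "proxy" }, // id from the commit message
};

function isOperation(value: string): value is Operation {
  return (OPERATIONS as readonly string[]).indexOf(value) !== -1;
}

// Resolves the tool for a block operation and rejects unknown operations explicitly.
function selectTool(operation: string): ToolRoute {
  if (!isOperation(operation)) {
    throw new Error(`Unknown VoyageAI operation: ${operation}`);
  }
  return ROUTES[operation];
}
```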
Reviews (1): Last reviewed commit: "revert: drop all MongoDB connection stri..."
- Rename imageFiles/videoFile subblock IDs to avoid canonicalParamId collision
- Add response.ok guard in embeddings and rerank transformResponse
- Remove unused truncation param from types
- Fix test pattern: use beforeEach/clearAllMocks instead of afterEach/resetAllMocks
- Add Array.isArray guard for JSON.parse of imageUrls
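The last item matters because JSON.parse succeeds on any valid JSON value, including objects and bare strings, so the parsed result has to be checked before it is iterated. A minimal sketch of such a guard, using a hypothetical `parseImageUrls` helper rather than the PR's actual code:

```typescript
// Hypothetical helper: parses a JSON-encoded list of image URLs from a block input.
function parseImageUrls(raw: string | undefined): string[] {
  const trimmed = raw ? raw.trim() : "";
  if (trimmed === "") {
    return [];
  }

  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch (_err) {
    throw new Error("imageUrls must be valid JSON");
  }

  // JSON.parse also accepts objects, numbers, and bare strings, so check the shape explicitly.
  if (!Array.isArray(parsed)) {
    throw new Error("imageUrls must be a JSON array of strings");
  }
  return parsed.filter(
    (item): item is string => typeof item === "string" && item.trim() !== ""
  );
}
```

Dropping non-string entries silently is one possible policy; rejecting the whole input would be an equally reasonable choice.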
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 80e0099.
4962ed6 to 6394764

Summary
Brief description of what this PR does and why.
Fixes #(issue)
Type of Change
Testing
I added unit tests and integration tests, and also tested manually.
Checklist
Screenshots/Videos